pacman::p_load(sf, tidyverse, tmap, spNetwork, sp,spatstat ,dplyr)Takehome_Ex01
1 Setting the Scene
Road traffic accidents are a significant global issue, causing approximately 1.19 million deaths and leaving 20 to 50 million people with non-fatal injuries annually, according to the World Health Organization (WHO). Vulnerable road users, like pedestrians, cyclists, and motorcyclists, make up more than half of these fatalities, with most incidents occurring in low- and middle-income countries. Thailand, in particular, has one of the highest road traffic death rates in Southeast Asia, with around 20,000 deaths each year. Economic impacts are also substantial, costing countries about 3% of their GDP. In Thailand, national highways see the highest concentration of accidents, especially in straight road sections and other accident-prone zones like intersections and curves.
2 Objectives
The objectives of this take-home exercise are as follows:
-To visualize the spatio-temporal dynamics of road traffic accidents in BMR using appropriate statistical graphics and geovisualization methods. -To conduct detailed spatial analysis of road traffic accidents using appropriate Network Spatial Point Patterns Analysis methods. -To conduct detailed spatio-temporal analysis of road traffic accidents using appropriate Temporal Network Spatial Point Patterns Analysis methods.
3 Data Collection
3.1 Loading Packages
3.2 Data Collect
Three data sets are used in this exercise, they are:
Thailand Road Accident [2019-2022] on Kaggle. This dataset offers comprehensive statistics on road accidents recorded in Thailand from approximately 2019 to 2022, covering various aspects of the incidents.
Thailand Roads (OpenStreetMap Export) on HDX. This dataset, sourced from OpenStreetMap, provides a detailed map of Thailand’s road network.
Thailand - Subnational Administrative Boundaries on HDX. This dataset provides comprehensive geographic data on Thailand’s subnational administrative boundaries, covering provinces, districts, and subdistricts.
4 KDE analysis
In this part, we will conduct a kernel density analysis for the Bangkok Metropolitan Region (BMR) to simply visualize and identify areas with a higher concentration of car accidents.
4.1 Data Preparation
4.4.1 Data import
acc <- read.csv("data/aspatical/thai_road_accident_2019_2022.csv") tr <- st_read(dsn="data/geospatial", layer = 'hotosm_tha_roads_lines_shp')Reading layer `hotosm_tha_roads_lines_shp' from data source
`D:\FuWanqian\ISSS608-VAA\Takehome_Ex\Takehome_Ex01\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 2792361 features and 14 fields
Geometry type: MULTILINESTRING
Dimension: XY
Bounding box: xmin: 97.34457 ymin: 5.643645 xmax: 105.6528 ymax: 20.47168
Geodetic CRS: WGS 84
ad <- st_read(dsn="data/geospatial", layer = 'tha_admbnda_adm1_rtsd_20220121')Reading layer `tha_admbnda_adm1_rtsd_20220121' from data source
`D:\FuWanqian\ISSS608-VAA\Takehome_Ex\Takehome_Ex01\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 77 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 97.34336 ymin: 5.613038 xmax: 105.637 ymax: 20.46507
Geodetic CRS: WGS 84
Here we first import the administrative level 1 data, which includes the provincial boundaries of Thailand.
4.4.2 Data selection
bmr_acc<- acc %>%
filter(province_en %in% c("Bangkok", "Nakhon Pathom", "Pathum Thani",
"Nonthaburi", "Samut Prakan", "Samut Sakhon"))bmr_ad<- ad %>%
filter(ADM1_EN %in% c("Bangkok", "Nakhon Pathom", "Pathum Thani",
"Nonthaburi", "Samut Prakan", "Samut Sakhon"))Here we filter the BMR data from both the accident data and the provincial boundary data of Thailand.
4.4.3 Drop missing value for accident data
missing_coords <- bmr_acc[is.na(bmr_acc$longitude) | is.na(bmr_acc$latitude), ]
missing_coords_count <- nrow(missing_coords)
missing_coords_count[1] 350
The number of missing coordinates is 350, so using below codes to drop missing value.
bmr_acc2 <- bmr_acc[!is.na(bmr_acc$longitude) & !is.na(bmr_acc$latitude), ]4.4.4 Transform data format
It is important for us to ensure that all the data are projected in same projection system, then here we transform BMR accident data and BMR boundary to EPSG:32647 (UTM Zone 47N).
bmr_acc_data <- st_as_sf(bmr_acc2, coords = c("longitude", "latitude"), crs = 4326)
bmr_acc_data_utm <- st_transform(bmr_acc_data, crs = 32647)Simple feature collection with 12986 features and 16 fields
Geometry type: POINT Dimension: XY
Bounding box: xmin: 591277.5 ymin: 1486846 xmax: 710166.1 ymax: 1576520
Projected CRS: WGS 84 / UTM zone 47N
bmr_ad_data_utm <- st_transform(bmr_ad, crs = 32647)Simple feature collection with 6 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 587893.5 ymin: 1484414 xmax: 712440.5 ymax: 1579076
Projected CRS: WGS 84 / UTM zone 47N
4.2 Mapping the data
plot(st_geometry(bmr_ad_data_utm), col = "lightgrey", border = "black", main = "BMR Accident Locations (Base R)")
plot(st_geometry(bmr_acc_data_utm), col = "black", pch = 19, cex = 0.1, add = TRUE)
tm_shape(bmr_ad_data_utm) +
tm_borders(col = "black", lwd = 1) + borders
tm_shape(bmr_acc_data_utm) +
tm_dots(col = "black", size = 0.01) +
tm_layout(title = "BMR Accident Locations (tmap)")


The first plot shows the locations of traffic accidents in the Bangkok Metropolitan Region, with black dots concentrated in the city center and along major roads, especially near intersections and busy traffic areas. Accident density varies across different provinces, with hotspots mainly distributed along major roads.
4.3 Convert data to ppp format
4.3.1 sp format
bmr_acc_data_sp <- as_Spatial(bmr_acc_data_utm)
bmr_boundary_sp <- as_Spatial(bmr_ad_data_utm)4.3.2 ppp format
acc_coords <- coordinates(bmr_acc_data_sp)
bbox_values <- bbox(bmr_boundary_sp)bmr_window <- owin(xrange = c(bbox_values[1, 1], bbox_values[1, 2]),
yrange = c(bbox_values[2, 1], bbox_values[2, 2]))bmr_acc_ppp <- ppp(x = acc_coords[, 1], y = acc_coords[, 2], window = bmr_window)plot(bmr_acc_ppp, main = "BMR Accident Locations", cex = 0.5)
4.4 Handling duplicate points
4.4.1 check duplicates
sum(multiplicity(bmr_acc_ppp) > 1)[1] 2293
4.4.2 Apply jitter to handle duplicates
bmr_acc_ppp_jit <- rjitter(bmr_acc_ppp, retry = TRUE, nsim = 1, drop = TRUE)
any(duplicated(bmr_acc_ppp_jit))[1] FALSE
4.5 Combine point events object and owin object
bmr_acc_ppp_final <- bmr_acc_ppp_jit[bmr_window]4.6 Kernel Density Estimation (KDE) Analysis
sigma_value <- 12000
kde_bmr_acc <- density(bmr_acc_ppp_final,
sigma = sigma_value,
edge = TRUE,
kernel = "gaussian") plot(kde_bmr_acc, main = "KDE of Accident Data")
plot(st_geometry(bmr_ad_data_utm), add = TRUE, border = "black", lwd = 2)
From the map, it is clear that central Bangkok and parts of Samut Prakan exhibit the highest density of car accidents, indicating that accidents are more concentrated in these urban areas. In contrast, the outer regions such as Nakhon Pathom, Pathum Thani, and Nonthaburi show much lower accident densities.
4.7 G Function
Null Hypothesis (H0): The spatial distribution of car accidents follows a completely spatially random (CSR) pattern, meaning that accidents occur independently and uniformly over the study area.
Alternative Hypothesis (H1): The spatial distribution of car accidents is not random and exhibits clustering or dispersion, implying that accidents are more likely to occur near each other or further apart than expected under CSR.
G_CK = Gest(bmr_acc_ppp_final, correction = "border")plot(G_CK, xlim=c(0,500))
The plot shows that the observed G function (black line) rises more quickly than the expected Poisson function (red dashed line), indicating that car accidents are spatially clustered.
5 NKDE analysis
Based on the KDE analysis, we found that Bangkok has the highest concentration of car accidents. Therefore, we refine the NKDE (Network Kernel Density Estimation) analysis to focus specifically on Bangkok’s road network, allowing for a more precise identification of accident hotspots along key roads and intersections.
5.1 Data select
Lots of side roads in the road data may influence the accuracy of the NKDE results, so here we filter the dataset to include only major roads.
main_rd <- tr %>%
filter(highway %in% c("motorway", "trunk", "primary", "secondary"))Here we select bangkok city boundary.
selected_cities <- c("Bangkok")
bangkok_city <- bmr_ad_data_utm %>%
filter(ADM1_EN %in% selected_cities)Then we select the main roads within Bangkok by intersecting the road and city boundary datasets.
main_rd_format <- st_transform(main_rd, crs = 32647)main_bangkok_roads <- st_intersection(main_rd_format, bangkok_city)saveRDS(main_bangkok_roads, "data/geospatial/main_bangkok_roads2.rds")main_bangkok_roads <- readRDS("data/geospatial/main_bangkok_roads2.rds")Filters the accident data to include only records from the Bangkok province.
bangkok_acc <- bmr_acc_data_utm %>%
filter(province_en %in% "Bangkok")5.2 Transfer to linestring
main_bangkok_roads <- main_bangkok_roads %>%
filter(st_geometry_type(main_bangkok_roads) %in% c("LINESTRING", "MULTILINESTRING"))
main_bangkok_roads <- st_cast(main_bangkok_roads, "LINESTRING", group_or_split = TRUE)5.3 Generate lixels
lixels_bangkok <- lixelize_lines(main_bangkok_roads,
10000,
mindist = 5000)
samples_bangkok <- lines_center(lixels_bangkok)5.4 Perform NKDE
nkde_result_bangkok <- nkde(
lines = lixels_bangkok,
events = bangkok_acc,
w = rep(1, nrow(bangkok_acc)),
samples = samples_bangkok,
kernel_name = "quartic",
bw = 500,
div = "bw",
method = "simple",
grid_shape = c(200, 200),
verbose = TRUE
)
saveRDS(nkde_result_bangkok, "data/aspatical/nkde_result_bangkok.rds")nkde_result_bangkok <- readRDS("data/aspatical/nkde_result_bangkok.rds")
head(nkde_result_bangkok, 10) [1] 0.000000e+00 0.000000e+00 0.000000e+00 2.435628e-06 0.000000e+00
[6] 0.000000e+00 3.205377e-06 0.000000e+00 2.432477e-06 0.000000e+00
5.5 Visualize NKDE results
samples_bangkok$density <- nkde_result_bangkok
lixels_bangkok$density <- nkde_result_bangkoksamples_bangkok$density <- samples_bangkok$density * 10000
lixels_bangkok$density <- lixels_bangkok$density * 10000tmap_mode('view')
tm_shape(lixels_bangkok) +
tm_lines(col = "density", palette = "YlOrRd", title.col = "Density", lwd = 2) +
tm_shape(bangkok_acc) +
tm_dots(size = 0.1, col = "blue", alpha = 0.5, title = "Accidents")